In today's world, businesses are shifting online and the availability of information has increased exponentially. By processing, analyzing and visualizing this information, I can gain powerful insights. With developments and innovations in the industry, this vast amount of data is accessible on devices such as cell phones, tablets, laptops and desktop computers.
A cell phone is one of the most essential things we own nowadays: it keeps us connected to people, helps us get tasks done, provides entertainment, and so on. Choosing the right cell phone matters because it is the device people spend most of their time on.
Amazon is one of the largest and most successful retailers of electronic devices, and many people buy devices such as cell phones there. The reviews on Amazon are valuable because they tell retailers and manufacturers why a customer likes or dislikes a product, which can save them the cost of conducting customer surveys.
In this project, I help customers choose the best cell phones by analyzing the reviews posted on www.amazon.com for different cellphone brands such as Apple, Motorola, Samsung and Xiaomi. I visualize the data to identify patterns and trends in the reviews for each brand. Furthermore, I perform sentiment analysis on the reviews to determine whether customers like or dislike the phones, and compare the sentiments across brands and individual cellphone models to identify similarities and differences.
The data comes from two files: `items.csv` and `reviews.csv`.
# Import the required libraries
import pandas as pd
import numpy as np
from numpy import nan as NA
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer
nltk.download("vader_lexicon")
nltk.download('stopwords')
nltk.download('wordnet')
import imageio
from wordcloud import WordCloud
from wordcloud import STOPWORDS
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
stopwords = STOPWORDS
%matplotlib inline
[nltk_data] Downloading package vader_lexicon to
[nltk_data]     C:\Users\Aditya\AppData\Roaming\nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\Aditya\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\Aditya\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
Load `items.csv` and `reviews.csv`, and rename the column `asin` to `product_id`, which will later be used to identify each individual product.
# Load the items dataset
items_df = pd.read_csv("items.csv")
items_df.rename(columns = {"asin": "product_id"}, inplace = True)
items_df
| | product_id | brand | title | url | image | rating | reviewUrl | totalReviews | price | originalPrice |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | B0000SX2UC | NaN | Dual-Band / Tri-Mode Sprint PCS Phone w/ Voice... | https://www.amazon.com/Dual-Band-Tri-Mode-Acti... | https://m.media-amazon.com/images/I/2143EBQ210... | 3.0 | https://www.amazon.com/product-reviews/B0000SX2UC | 14 | 0.00 | 0.00 |
| 1 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | 0.00 |
| 2 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | 0.00 |
| 3 | B001AO4OUC | Motorola | Motorola i335 Cell Phone Boost Mobile | https://www.amazon.com/Motorola-i335-Phone-Boo... | https://m.media-amazon.com/images/I/710UO8gdT+... | 3.3 | https://www.amazon.com/product-reviews/B001AO4OUC | 21 | 0.00 | 0.00 |
| 4 | B001DCJAJG | Motorola | Motorola V365 no contract cellular phone AT&T | https://www.amazon.com/Motorola-V365-contract-... | https://m.media-amazon.com/images/I/61LYNCVrrK... | 3.1 | https://www.amazon.com/product-reviews/B001DCJAJG | 12 | 149.99 | 0.00 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 715 | B07ZPKZSSC | Apple | Apple iPhone 11 Pro, 64GB, Fully Unlocked - Sp... | https://www.amazon.com/Apple-iPhone-64GB-Fully... | https://m.media-amazon.com/images/I/41wDuEW9iZ... | 1.0 | https://www.amazon.com/product-reviews/B07ZPKZSSC | 1 | 949.00 | 0.00 |
| 716 | B07ZQSGP53 | Xiaomi | Xiaomi Redmi Note 8, 32GB/3GB RAM 6.3" FHD+ Di... | https://www.amazon.com/Xiaomi-Display-Snapdrag... | https://m.media-amazon.com/images/I/41foh4FKHE... | 4.6 | https://www.amazon.com/product-reviews/B07ZQSGP53 | 3 | 150.96 | 0.00 |
| 717 | B081H6STQQ | Sony | Sony Xperia 1 Unlocked Smartphone and WH1000XM... | https://www.amazon.com/Sony-Smartphone-WH1000X... | https://m.media-amazon.com/images/I/51zZTAXZTP... | 4.5 | https://www.amazon.com/product-reviews/B081H6STQQ | 70 | 948.00 | 0.00 |
| 718 | B081TJFVCJ | Apple | Apple iPhone X, 64GB, Gray - Fully Unlocked (R... | https://www.amazon.com/Apple-iPhone-64GB-Gray-... | https://m.media-amazon.com/images/I/71yMgOenT5... | 5.0 | https://www.amazon.com/product-reviews/B081TJFVCJ | 1 | 478.97 | 0.00 |
| 719 | B0825BB7SG | Samsung | Straight Talk Samsung Galaxy A10e Smartphone 5... | https://www.amazon.com/Straight-Samsung-Galaxy... | https://m.media-amazon.com/images/I/81+3SWSAhD... | 5.0 | https://www.amazon.com/product-reviews/B0825BB7SG | 1 | 139.00 | 139.26 |
720 rows × 10 columns
# Load the reviews dataset
reviews_df = pd.read_csv("reviews.csv")
reviews_df.rename(columns = {"asin": "product_id"}, inplace = True)
reviews_df
| | product_id | name | rating | date | verified | title | body | helpfulVotes |
|---|---|---|---|---|---|---|---|---|
| 0 | B0000SX2UC | Janet | 3 | October 11, 2005 | False | Def not best, but not worst | I had the Samsung A600 for awhile which is abs... | 1.0 |
| 1 | B0000SX2UC | Luke Wyatt | 1 | January 7, 2004 | False | Text Messaging Doesn't Work | Due to a software issue between Nokia and Spri... | 17.0 |
| 2 | B0000SX2UC | Brooke | 5 | December 30, 2003 | False | Love This Phone | This is a great, reliable phone. I also purcha... | 5.0 |
| 3 | B0000SX2UC | amy m. teague | 3 | March 18, 2004 | False | Love the Phone, BUT...! | I love the phone and all, because I really did... | 1.0 |
| 4 | B0000SX2UC | tristazbimmer | 4 | August 28, 2005 | False | Great phone service and options, lousy case! | The phone has been great for every purpose it ... | 1.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 67981 | B081H6STQQ | jande | 5 | August 16, 2019 | False | Awesome Phone, but finger scanner is a big mis... | I love the camera on this phone. The screen is... | 1.0 |
| 67982 | B081H6STQQ | 2cool4u | 5 | September 14, 2019 | False | Simply Amazing! | I've been an Xperia user for several years and... | 1.0 |
| 67983 | B081H6STQQ | simon | 5 | July 14, 2019 | False | great phon3, but many bugs need to fix. still ... | buy one more for my cousin | NaN |
| 67984 | B081TJFVCJ | Tobiasz Jedrysiak | 5 | December 24, 2019 | True | Phone is like new | Product looks and works like new. Very much re... | NaN |
| 67985 | B0825BB7SG | Owen Gonzalez | 5 | December 11, 2019 | False | Outstanding phone for the price | I love the size and style of this phone. It is... | NaN |
67986 rows × 8 columns
Merge the `items.csv` and `reviews.csv` data on the column common to both dataframes, `product_id`, using the `merge()` function, and store the result in `project_df`.
# Merge the two dataframes
project_df = pd.merge(items_df, reviews_df, how = "right", on = "product_id")
project_df
| | product_id | brand | title_x | url | image | rating_x | reviewUrl | totalReviews | price | originalPrice | name | rating_y | date | verified | title_y | body | helpfulVotes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | B0000SX2UC | NaN | Dual-Band / Tri-Mode Sprint PCS Phone w/ Voice... | https://www.amazon.com/Dual-Band-Tri-Mode-Acti... | https://m.media-amazon.com/images/I/2143EBQ210... | 3.0 | https://www.amazon.com/product-reviews/B0000SX2UC | 14 | 0.00 | 0.00 | Janet | 3 | October 11, 2005 | False | Def not best, but not worst | I had the Samsung A600 for awhile which is abs... | 1.0 |
| 1 | B0000SX2UC | NaN | Dual-Band / Tri-Mode Sprint PCS Phone w/ Voice... | https://www.amazon.com/Dual-Band-Tri-Mode-Acti... | https://m.media-amazon.com/images/I/2143EBQ210... | 3.0 | https://www.amazon.com/product-reviews/B0000SX2UC | 14 | 0.00 | 0.00 | Luke Wyatt | 1 | January 7, 2004 | False | Text Messaging Doesn't Work | Due to a software issue between Nokia and Spri... | 17.0 |
| 2 | B0000SX2UC | NaN | Dual-Band / Tri-Mode Sprint PCS Phone w/ Voice... | https://www.amazon.com/Dual-Band-Tri-Mode-Acti... | https://m.media-amazon.com/images/I/2143EBQ210... | 3.0 | https://www.amazon.com/product-reviews/B0000SX2UC | 14 | 0.00 | 0.00 | Brooke | 5 | December 30, 2003 | False | Love This Phone | This is a great, reliable phone. I also purcha... | 5.0 |
| 3 | B0000SX2UC | NaN | Dual-Band / Tri-Mode Sprint PCS Phone w/ Voice... | https://www.amazon.com/Dual-Band-Tri-Mode-Acti... | https://m.media-amazon.com/images/I/2143EBQ210... | 3.0 | https://www.amazon.com/product-reviews/B0000SX2UC | 14 | 0.00 | 0.00 | amy m. teague | 3 | March 18, 2004 | False | Love the Phone, BUT...! | I love the phone and all, because I really did... | 1.0 |
| 4 | B0000SX2UC | NaN | Dual-Band / Tri-Mode Sprint PCS Phone w/ Voice... | https://www.amazon.com/Dual-Band-Tri-Mode-Acti... | https://m.media-amazon.com/images/I/2143EBQ210... | 3.0 | https://www.amazon.com/product-reviews/B0000SX2UC | 14 | 0.00 | 0.00 | tristazbimmer | 4 | August 28, 2005 | False | Great phone service and options, lousy case! | The phone has been great for every purpose it ... | 1.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 67981 | B081H6STQQ | Sony | Sony Xperia 1 Unlocked Smartphone and WH1000XM... | https://www.amazon.com/Sony-Smartphone-WH1000X... | https://m.media-amazon.com/images/I/51zZTAXZTP... | 4.5 | https://www.amazon.com/product-reviews/B081H6STQQ | 70 | 948.00 | 0.00 | jande | 5 | August 16, 2019 | False | Awesome Phone, but finger scanner is a big mis... | I love the camera on this phone. The screen is... | 1.0 |
| 67982 | B081H6STQQ | Sony | Sony Xperia 1 Unlocked Smartphone and WH1000XM... | https://www.amazon.com/Sony-Smartphone-WH1000X... | https://m.media-amazon.com/images/I/51zZTAXZTP... | 4.5 | https://www.amazon.com/product-reviews/B081H6STQQ | 70 | 948.00 | 0.00 | 2cool4u | 5 | September 14, 2019 | False | Simply Amazing! | I've been an Xperia user for several years and... | 1.0 |
| 67983 | B081H6STQQ | Sony | Sony Xperia 1 Unlocked Smartphone and WH1000XM... | https://www.amazon.com/Sony-Smartphone-WH1000X... | https://m.media-amazon.com/images/I/51zZTAXZTP... | 4.5 | https://www.amazon.com/product-reviews/B081H6STQQ | 70 | 948.00 | 0.00 | simon | 5 | July 14, 2019 | False | great phon3, but many bugs need to fix. still ... | buy one more for my cousin | NaN |
| 67984 | B081TJFVCJ | Apple | Apple iPhone X, 64GB, Gray - Fully Unlocked (R... | https://www.amazon.com/Apple-iPhone-64GB-Gray-... | https://m.media-amazon.com/images/I/71yMgOenT5... | 5.0 | https://www.amazon.com/product-reviews/B081TJFVCJ | 1 | 478.97 | 0.00 | Tobiasz Jedrysiak | 5 | December 24, 2019 | True | Phone is like new | Product looks and works like new. Very much re... | NaN |
| 67985 | B0825BB7SG | Samsung | Straight Talk Samsung Galaxy A10e Smartphone 5... | https://www.amazon.com/Straight-Samsung-Galaxy... | https://m.media-amazon.com/images/I/81+3SWSAhD... | 5.0 | https://www.amazon.com/product-reviews/B0825BB7SG | 1 | 139.00 | 139.26 | Owen Gonzalez | 5 | December 11, 2019 | False | Outstanding phone for the price | I love the size and style of this phone. It is... | NaN |
67986 rows × 17 columns
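A right merge keeps every review row, and the result has the same number of rows as the reviews table only if each `product_id` appears once in the items table. The sketch below uses small made-up frames (not the Amazon data) to show how the `validate`, `indicator` and `suffixes` parameters of `pd.merge()` could check that assumption and name the overlapping columns up front. The suffix names are illustrative choices, not something the project uses.

```python
import pandas as pd

# Toy stand-ins for items_df and reviews_df (all values are made up)
items = pd.DataFrame({
    "product_id": ["A1", "B2"],
    "title": ["Phone A", "Phone B"],
    "rating": [4.0, 3.5],
})
reviews = pd.DataFrame({
    "product_id": ["A1", "A1", "B2", "C3"],
    "title": ["Great", "Okay", "Bad", "Orphan review"],
    "rating": [5, 3, 1, 4],
})

merged = pd.merge(
    items,
    reviews,
    how="right",                       # keep every review row
    on="product_id",
    suffixes=("_product", "_review"),  # instead of the default _x / _y
    validate="one_to_many",            # raise if product_id repeats in items
    indicator=True,                    # add a _merge column with the match status
)

# Reviews whose product_id has no match in the items table
unmatched = merged[merged["_merge"] == "right_only"]
print(len(merged), len(unmatched))
```

With descriptive suffixes the overlapping `title` and `rating` columns arrive already labeled, so a later rename becomes optional.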
Rename the ambiguous columns in `project_df` using the pandas `rename()` method.
# Rename the columns
project_df.rename(columns = {"title_x": "product_title","rating_x": "product_rating", "rating_y": "review_rating", "title_y": "review_title"}, inplace = True)
project_df.head()
| | product_id | brand | product_title | url | image | product_rating | reviewUrl | totalReviews | price | originalPrice | name | review_rating | date | verified | review_title | body | helpfulVotes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | B0000SX2UC | NaN | Dual-Band / Tri-Mode Sprint PCS Phone w/ Voice... | https://www.amazon.com/Dual-Band-Tri-Mode-Acti... | https://m.media-amazon.com/images/I/2143EBQ210... | 3.0 | https://www.amazon.com/product-reviews/B0000SX2UC | 14 | 0.0 | 0.0 | Janet | 3 | October 11, 2005 | False | Def not best, but not worst | I had the Samsung A600 for awhile which is abs... | 1.0 |
| 1 | B0000SX2UC | NaN | Dual-Band / Tri-Mode Sprint PCS Phone w/ Voice... | https://www.amazon.com/Dual-Band-Tri-Mode-Acti... | https://m.media-amazon.com/images/I/2143EBQ210... | 3.0 | https://www.amazon.com/product-reviews/B0000SX2UC | 14 | 0.0 | 0.0 | Luke Wyatt | 1 | January 7, 2004 | False | Text Messaging Doesn't Work | Due to a software issue between Nokia and Spri... | 17.0 |
| 2 | B0000SX2UC | NaN | Dual-Band / Tri-Mode Sprint PCS Phone w/ Voice... | https://www.amazon.com/Dual-Band-Tri-Mode-Acti... | https://m.media-amazon.com/images/I/2143EBQ210... | 3.0 | https://www.amazon.com/product-reviews/B0000SX2UC | 14 | 0.0 | 0.0 | Brooke | 5 | December 30, 2003 | False | Love This Phone | This is a great, reliable phone. I also purcha... | 5.0 |
| 3 | B0000SX2UC | NaN | Dual-Band / Tri-Mode Sprint PCS Phone w/ Voice... | https://www.amazon.com/Dual-Band-Tri-Mode-Acti... | https://m.media-amazon.com/images/I/2143EBQ210... | 3.0 | https://www.amazon.com/product-reviews/B0000SX2UC | 14 | 0.0 | 0.0 | amy m. teague | 3 | March 18, 2004 | False | Love the Phone, BUT...! | I love the phone and all, because I really did... | 1.0 |
| 4 | B0000SX2UC | NaN | Dual-Band / Tri-Mode Sprint PCS Phone w/ Voice... | https://www.amazon.com/Dual-Band-Tri-Mode-Acti... | https://m.media-amazon.com/images/I/2143EBQ210... | 3.0 | https://www.amazon.com/product-reviews/B0000SX2UC | 14 | 0.0 | 0.0 | tristazbimmer | 4 | August 28, 2005 | False | Great phone service and options, lousy case! | The phone has been great for every purpose it ... | 1.0 |
Check for duplicate rows in `project_df` using the `duplicated()` function.
# Check for any duplicate rows in the dataframe
print(f"The total number of duplicated rows in the dataframe: {project_df.duplicated().sum()}")
project_df[project_df.duplicated()]
The total number of duplicated rows in the dataframe: 12
| | product_id | brand | product_title | url | image | product_rating | reviewUrl | totalReviews | price | originalPrice | name | review_rating | date | verified | review_title | body | helpfulVotes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2527 | B00836Y6B2 | Nokia | Nokia Lumia 900 Black Factory Unlocked | https://www.amazon.com/Nokia-Lumia-900-Factory... | https://m.media-amazon.com/images/I/81ZlbLtZ3P... | 3.2 | https://www.amazon.com/product-reviews/B00836Y6B2 | 929 | 109.98 | 0.00 | Dairon Alexander Maiz | 5 | February 13, 2015 | True | exelente | exelente | NaN |
| 3047 | B00836Y6B2 | Nokia | Nokia Lumia 900 Black Factory Unlocked | https://www.amazon.com/Nokia-Lumia-900-Factory... | https://m.media-amazon.com/images/I/81ZlbLtZ3P... | 3.2 | https://www.amazon.com/product-reviews/B00836Y6B2 | 929 | 109.98 | 0.00 | Waygo | 5 | January 18, 2015 | True | Five Stars | good | NaN |
| 3073 | B00836Y6B2 | Nokia | Nokia Lumia 900 Black Factory Unlocked | https://www.amazon.com/Nokia-Lumia-900-Factory... | https://m.media-amazon.com/images/I/81ZlbLtZ3P... | 3.2 | https://www.amazon.com/product-reviews/B00836Y6B2 | 929 | 109.98 | 0.00 | EDWIN MORENO | 5 | July 19, 2014 | True | Five Stars | good | NaN |
| 6985 | B00E6FGSHY | Samsung | Samsung Galaxy S4, White Frost 16GB (AT&T) | https://www.amazon.com/Samsung-Galaxy-S4-White... | https://m.media-amazon.com/images/I/81suaO+v0m... | 3.7 | https://www.amazon.com/product-reviews/B00E6FGSHY | 641 | 154.97 | 0.00 | Northwest | 5 | February 26, 2015 | True | Five Stars | Excellent phone. I love everything about it, s... | NaN |
| 8378 | B00F2SKPIM | Samsung | Samsung Galaxy Note 3, Black 32GB (Verizon Wir... | https://www.amazon.com/Samsung-Galaxy-Note-Ver... | https://m.media-amazon.com/images/I/91eFtaIWpc... | 3.9 | https://www.amazon.com/product-reviews/B00F2SKPIM | 983 | 0.00 | 0.00 | Duncan K. | 5 | March 6, 2014 | True | Fantastic phone | i went from a s4 to the note 3 and all i can s... | 1.0 |
| 14917 | B00OZTSY6Y | Motorola | Motorola DROID Turbo XT1254, Black Ballistic N... | https://www.amazon.com/Motorola-DROID-Turbo-Ba... | https://m.media-amazon.com/images/I/81VMO0Upxf... | 3.6 | https://www.amazon.com/product-reviews/B00OZTSY6Y | 545 | 149.99 | 0.00 | B. KESLER | 5 | March 13, 2015 | False | Best Phone on Verizon Right Now | I have long been a Motorola user, dating back ... | 1.0 |
| 15628 | B00V7FY44A | Samsung | Samsung Galaxy S6 Edge, White Pearl 32GB (Veri... | https://www.amazon.com/Samsung-Galaxy-S6-Edge-... | https://m.media-amazon.com/images/I/814WwhaRVu... | 3.5 | https://www.amazon.com/product-reviews/B00V7FY44A | 505 | 0.00 | 0.00 | Domingo Gomes - NewEsc | 5 | May 10, 2015 | False | Five Stars | This is the best android Phone ever!!! I love ... | NaN |
| 18163 | B014GCG150 | Samsung | Samsung Galaxy J5 SM-J500H/DS GSM Factory Unlo... | https://www.amazon.com/Samsung-SM-J500H-Unlock... | https://m.media-amazon.com/images/I/61alJun3Jv... | 4.1 | https://www.amazon.com/product-reviews/B014GCG150 | 645 | 148.98 | 0.00 | Caleb Ajibade | 5 | July 8, 2019 | True | Good | Good | NaN |
| 27123 | B01M0PADR4 | | Google Pixel XL G2PW210032GBBK Factory Unlocke... | https://www.amazon.com/Google-G2PW210032GBBK-U... | https://m.media-amazon.com/images/I/71PZz7CQ9U... | 3.4 | https://www.amazon.com/product-reviews/B01M0PADR4 | 429 | 201.48 | 0.00 | anon | 5 | January 10, 2017 | True | Five Stars | Works great | 2.0 |
| 33455 | B06XYMCMHD | Samsung | Samsung Galaxy S8 SM-G950F Unlocked 64GB - Int... | https://www.amazon.com/Samsung-Galaxy-SM-G950F... | https://m.media-amazon.com/images/I/81FWIR3RbU... | 3.9 | https://www.amazon.com/product-reviews/B06XYMCMHD | 117 | 399.99 | 0.00 | Bulltwinkle | 1 | July 15, 2017 | False | Amazon sells these phones without a US recogni... | Amazon sells these phones without a US recogni... | 3.0 |
| 53176 | B07HK4JNV1 | Xiaomi | Xiaomi Redmi Note 6 Pro 64GB / 4GB RAM 6.26" D... | https://www.amazon.com/Xiaomi-Factory-Unlocked... | https://m.media-amazon.com/images/I/517Q3-wHBk... | 4.3 | https://www.amazon.com/product-reviews/B07HK4JNV1 | 441 | 179.50 | 0.00 | vibhor jain | 5 | April 8, 2019 | True | Nice phone | Awsome | NaN |
| 66713 | B07WSJYDXX | Motorola | Motorola G6 – 32 GB – Unlocked (AT&T/Sprint/T-... | https://www.amazon.com/Motorola-G6-Unlocked-T-... | https://m.media-amazon.com/images/I/51L6DbMbvK... | 3.9 | https://www.amazon.com/product-reviews/B07WSJYDXX | 836 | 119.99 | 249.99 | susan | 4 | July 15, 2019 | True | Good. | Good. | NaN |
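By default `duplicated()` flags only the second and later copies of a row, which is why 12 rows are listed above. To see each duplicate next to the row it repeats, `keep=False` can be passed instead. A minimal sketch on made-up data:

```python
import pandas as pd

# Made-up reviews; rows 0 and 2 are identical
toy = pd.DataFrame({
    "name": ["Ann", "Bob", "Ann", "Cy"],
    "review_title": ["Good", "Five Stars", "Good", "Good"],
    "body": ["Good", "good", "Good", "Nice"],
})

later_copies = toy[toy.duplicated()]           # default keep="first"
all_copies = toy[toy.duplicated(keep=False)]   # every row that has a twin

print(len(later_copies), len(all_copies))
```

Looking at both copies side by side helps decide whether a repeat is a scraping artifact or two genuinely separate short reviews such as "good".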
Drop the duplicate rows using the `drop_duplicates()` method.
# Drop the duplicate rows
project_df.drop_duplicates(inplace = True)
project_df.head()
| | product_id | brand | product_title | url | image | product_rating | reviewUrl | totalReviews | price | originalPrice | name | review_rating | date | verified | review_title | body | helpfulVotes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | B0000SX2UC | NaN | Dual-Band / Tri-Mode Sprint PCS Phone w/ Voice... | https://www.amazon.com/Dual-Band-Tri-Mode-Acti... | https://m.media-amazon.com/images/I/2143EBQ210... | 3.0 | https://www.amazon.com/product-reviews/B0000SX2UC | 14 | 0.0 | 0.0 | Janet | 3 | October 11, 2005 | False | Def not best, but not worst | I had the Samsung A600 for awhile which is abs... | 1.0 |
| 1 | B0000SX2UC | NaN | Dual-Band / Tri-Mode Sprint PCS Phone w/ Voice... | https://www.amazon.com/Dual-Band-Tri-Mode-Acti... | https://m.media-amazon.com/images/I/2143EBQ210... | 3.0 | https://www.amazon.com/product-reviews/B0000SX2UC | 14 | 0.0 | 0.0 | Luke Wyatt | 1 | January 7, 2004 | False | Text Messaging Doesn't Work | Due to a software issue between Nokia and Spri... | 17.0 |
| 2 | B0000SX2UC | NaN | Dual-Band / Tri-Mode Sprint PCS Phone w/ Voice... | https://www.amazon.com/Dual-Band-Tri-Mode-Acti... | https://m.media-amazon.com/images/I/2143EBQ210... | 3.0 | https://www.amazon.com/product-reviews/B0000SX2UC | 14 | 0.0 | 0.0 | Brooke | 5 | December 30, 2003 | False | Love This Phone | This is a great, reliable phone. I also purcha... | 5.0 |
| 3 | B0000SX2UC | NaN | Dual-Band / Tri-Mode Sprint PCS Phone w/ Voice... | https://www.amazon.com/Dual-Band-Tri-Mode-Acti... | https://m.media-amazon.com/images/I/2143EBQ210... | 3.0 | https://www.amazon.com/product-reviews/B0000SX2UC | 14 | 0.0 | 0.0 | amy m. teague | 3 | March 18, 2004 | False | Love the Phone, BUT...! | I love the phone and all, because I really did... | 1.0 |
| 4 | B0000SX2UC | NaN | Dual-Band / Tri-Mode Sprint PCS Phone w/ Voice... | https://www.amazon.com/Dual-Band-Tri-Mode-Acti... | https://m.media-amazon.com/images/I/2143EBQ210... | 3.0 | https://www.amazon.com/product-reviews/B0000SX2UC | 14 | 0.0 | 0.0 | tristazbimmer | 4 | August 28, 2005 | False | Great phone service and options, lousy case! | The phone has been great for every purpose it ... | 1.0 |
# Check for null values in each column in the dataframe
project_df.isnull().sum()
product_id            0
brand               200
product_title         0
url                   0
image                 0
product_rating        0
reviewUrl             0
totalReviews          0
price                 0
originalPrice         0
name                  2
review_rating         0
date                  0
verified              0
review_title         14
body                 21
helpfulVotes      40763
dtype: int64
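Raw null counts are easier to judge as a share of the rows; in the output above, `helpfulVotes` is missing in about 60% of the rows that remain after de-duplication. A small sketch, using a made-up frame rather than `project_df`, that reports the fraction of missing values per column:

```python
import numpy as np
import pandas as pd

# Made-up frame with a sparsely filled column
toy = pd.DataFrame({
    "brand": ["Nokia", None, "Sony", "Apple"],
    "helpfulVotes": [1.0, np.nan, np.nan, np.nan],
})

# isnull() yields booleans, so the column mean is the share of missing values
null_share = toy.isnull().mean().sort_values(ascending=False)
print(null_share)
```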
In `project_df`, the columns `brand`, `name`, `review_title`, `body` and `helpfulVotes` contain null values. I delete the rows with null values in `brand`, `name`, `review_title` and `body`, because those rows are not useful for the analysis.
I drop the `helpfulVotes` column entirely because most of its values are null, so deleting rows would discard too many reviews.
I also drop the `originalPrice` column because its missing data is represented as 0.0, which adds no value to the analysis.
# Delete the rows with null values in the columns: 'brand', 'name', 'review_title' and 'body'
project_df.dropna(subset = ['brand', 'name', 'review_title', 'body'], inplace = True)
# Drop the 'helpfulVotes' and 'originalPrice' columns
project_df.drop(columns = ["helpfulVotes", "originalPrice"], inplace = True)
# Reset the index
project_df.reset_index(drop = True, inplace = True)
project_df
| | product_id | brand | product_title | url | image | product_rating | reviewUrl | totalReviews | price | name | review_rating | date | verified | review_title | body |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | Marcel Thomas | 1 | March 5, 2016 | True | Stupid phone | DON'T BUY OUT OF SERVICE |
| 1 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | William B. | 4 | February 9, 2006 | False | Exellent Service | I have been with nextel for nearly a year now ... |
| 2 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | K. Mcilhargey | 5 | February 7, 2006 | False | I love it | I just got it and have to say its easy to use,... |
| 3 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | Stephen Cahill | 1 | December 20, 2016 | True | Phones locked | 1 star because the phones locked so I have to ... |
| 4 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | Mihir | 5 | December 13, 2009 | True | Excellent product | The product has been very good. I had used thi... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 67737 | B081H6STQQ | Sony | Sony Xperia 1 Unlocked Smartphone and WH1000XM... | https://www.amazon.com/Sony-Smartphone-WH1000X... | https://m.media-amazon.com/images/I/51zZTAXZTP... | 4.5 | https://www.amazon.com/product-reviews/B081H6STQQ | 70 | 948.00 | jande | 5 | August 16, 2019 | False | Awesome Phone, but finger scanner is a big mis... | I love the camera on this phone. The screen is... |
| 67738 | B081H6STQQ | Sony | Sony Xperia 1 Unlocked Smartphone and WH1000XM... | https://www.amazon.com/Sony-Smartphone-WH1000X... | https://m.media-amazon.com/images/I/51zZTAXZTP... | 4.5 | https://www.amazon.com/product-reviews/B081H6STQQ | 70 | 948.00 | 2cool4u | 5 | September 14, 2019 | False | Simply Amazing! | I've been an Xperia user for several years and... |
| 67739 | B081H6STQQ | Sony | Sony Xperia 1 Unlocked Smartphone and WH1000XM... | https://www.amazon.com/Sony-Smartphone-WH1000X... | https://m.media-amazon.com/images/I/51zZTAXZTP... | 4.5 | https://www.amazon.com/product-reviews/B081H6STQQ | 70 | 948.00 | simon | 5 | July 14, 2019 | False | great phon3, but many bugs need to fix. still ... | buy one more for my cousin |
| 67740 | B081TJFVCJ | Apple | Apple iPhone X, 64GB, Gray - Fully Unlocked (R... | https://www.amazon.com/Apple-iPhone-64GB-Gray-... | https://m.media-amazon.com/images/I/71yMgOenT5... | 5.0 | https://www.amazon.com/product-reviews/B081TJFVCJ | 1 | 478.97 | Tobiasz Jedrysiak | 5 | December 24, 2019 | True | Phone is like new | Product looks and works like new. Very much re... |
| 67741 | B0825BB7SG | Samsung | Straight Talk Samsung Galaxy A10e Smartphone 5... | https://www.amazon.com/Straight-Samsung-Galaxy... | https://m.media-amazon.com/images/I/81+3SWSAhD... | 5.0 | https://www.amazon.com/product-reviews/B0825BB7SG | 1 | 139.00 | Owen Gonzalez | 5 | December 11, 2019 | False | Outstanding phone for the price | I love the size and style of this phone. It is... |
67742 rows × 15 columns
Convert the `date` column to a DateTime object in order to extract only the year of the review.
# Convert date column to a DateTime object
project_df["date"] = pd.to_datetime(project_df["date"])
# Extract the year of each review
project_df["review_year"] = project_df["date"].dt.year
project_df.head()
| | product_id | brand | product_title | url | image | product_rating | reviewUrl | totalReviews | price | name | review_rating | date | verified | review_title | body | review_year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | Marcel Thomas | 1 | 2016-03-05 | True | Stupid phone | DON'T BUY OUT OF SERVICE | 2016 |
| 1 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | William B. | 4 | 2006-02-09 | False | Exellent Service | I have been with nextel for nearly a year now ... | 2006 |
| 2 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | K. Mcilhargey | 5 | 2006-02-07 | False | I love it | I just got it and have to say its easy to use,... | 2006 |
| 3 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | Stephen Cahill | 1 | 2016-12-20 | True | Phones locked | 1 star because the phones locked so I have to ... | 2016 |
| 4 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | Mihir | 5 | 2009-12-13 | True | Excellent product | The product has been very good. I had used thi... | 2009 |
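The raw date strings follow the pattern "October 11, 2005". A sketch, assuming every date uses that pattern and English month names, that passes an explicit format and coerces unparseable strings to `NaT` so they can be counted instead of raising an error:

```python
import pandas as pd

# Two dates in the dataset's format plus one deliberately bad value
dates = pd.Series(["October 11, 2005", "January 7, 2004", "not a date"])

# An explicit format avoids per-row format guessing; errors="coerce" yields NaT
parsed = pd.to_datetime(dates, format="%B %d, %Y", errors="coerce")
years = parsed.dt.year

print(parsed.isna().sum())  # number of strings that failed to parse
print(years.tolist())
```

Counting the `NaT` values right after parsing is a cheap check that no review silently lost its date.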
Summarize the cellphones by brand using the `groupby()` function.
# Determine the number of cellphones for each brand
items_df.groupby(by = "brand")["title"].count()
brand
ASUS          5
Apple        63
Google       38
HUAWEI       32
Motorola    105
Nokia        44
OnePlus      10
Samsung     346
Sony         27
Xiaomi       46
Name: title, dtype: int64
# Determine the average product ratings for each cellphone brand
items_df.groupby(by = "brand")["rating"].mean().sort_values(ascending = False)
brand
Xiaomi      4.415217
HUAWEI      4.021875
ASUS        3.860000
Sony        3.788889
Apple       3.782540
Google      3.771053
Motorola    3.643810
Samsung     3.632659
OnePlus     3.580000
Nokia       3.386364
Name: rating, dtype: float64
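The count and the mean can also be computed in one pass with named aggregation, which keeps the sample size next to each average (an average over 5 ASUS products is less stable than one over 346 Samsung products). A sketch on made-up rows; the output column names `n_products` and `mean_rating` are illustrative:

```python
import pandas as pd

# Made-up product table with the same column names as items_df
toy_items = pd.DataFrame({
    "brand": ["Sony", "Sony", "Nokia", "Nokia", "Nokia"],
    "title": ["S1", "S2", "N1", "N2", "N3"],
    "rating": [4.0, 3.0, 3.0, 4.0, 5.0],
})

# One row per brand: number of products and their average rating
summary = (
    toy_items.groupby("brand")
    .agg(n_products=("title", "count"), mean_rating=("rating", "mean"))
    .sort_values("mean_rating", ascending=False)
)
print(summary)
```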
In `project_df`, some values in the `price` column are 0 or 1. I didn't drop those rows because they contain reviews, which are more important for the analysis.
Instead, I create a new dataframe `price_df` that excludes the rows with cell phone prices of 0 and 1, and plot the price against the product rating for each brand using the `px.scatter()` function.
# Create a new price dataframe with valid prices
price_df = project_df[(project_df['price']!=0) & (project_df['price']!=1)]
# Plot the scatterplot for price vs product rating for each cellphone brand
fig = px.scatter(price_df, x="product_rating", y="price", color='brand', title = "Price v/s Product_Rating")
fig.show()
The plot above shows the relationship between the price and the product rating. For lower-priced products the ratings span a wide range, roughly 2 to 5, but as the price increases the range of ratings narrows toward the higher end.
This suggests that buyers of more expensive phones tend to be more satisfied and have fewer complaints. Another possible explanation is that people treat a high-priced phone as an investment and research it more carefully before buying than people who buy low-priced phones, so they end up more satisfied with their product.
I can also observe that Samsung has the largest number of data points, and that at the high end of the price range only Samsung and Sony phones have reviews.
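Because `price_df` has one row per review, each product appears in the scatter plot once for every review it received. To put a number on the trend, one option is to keep one row per product and compute a rank (Spearman) correlation between price and rating. The sketch below uses made-up values and computes Spearman as the Pearson correlation of the ranks; it illustrates the idea and is not a result from the Amazon data.

```python
import pandas as pd

# Made-up review-level rows: products A and C each have two reviews
toy = pd.DataFrame({
    "product_id": ["A", "A", "B", "C", "C", "D"],
    "price": [50.0, 50.0, 120.0, 300.0, 300.0, 900.0],
    "product_rating": [3.0, 3.0, 2.5, 4.0, 4.0, 4.5],
})

# Keep one row per product so heavily reviewed phones do not dominate
per_product = toy.drop_duplicates(subset="product_id")

# Spearman correlation = Pearson correlation of the ranked values
spearman = per_product["price"].rank().corr(per_product["product_rating"].rank())
print(round(spearman, 2))
```

A rank correlation is used here because the relationship in the plot looks monotonic rather than linear, and prices are heavily skewed.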
Plot the average product rating of each brand over the years using the `sns.lineplot()` function.
# Determine the average product_rating for each brand and every year
data_rating_year=project_df.groupby(['brand', 'review_year'])[['product_rating']].mean().reset_index()
data_rating_year
| | brand | review_year | product_rating |
|---|---|---|---|
| 0 | ASUS | 2018 | 3.720468 |
| 1 | ASUS | 2019 | 3.945000 |
| 2 | Apple | 2015 | 2.800000 |
| 3 | Apple | 2016 | 3.180000 |
| 4 | Apple | 2017 | 3.878531 |
| ... | ... | ... | ... |
| 59 | Sony | 2017 | 3.655370 |
| 60 | Sony | 2018 | 3.728286 |
| 61 | Sony | 2019 | 3.864698 |
| 62 | Xiaomi | 2018 | 4.248000 |
| 63 | Xiaomi | 2019 | 4.393216 |
64 rows × 3 columns
# Plot the line graph of product_rating for each brand across years
sns.set_style('darkgrid') # Set the grid style
sns.set(rc = {'figure.figsize':(20,10)}) # Set the figure size
sns.set_context('talk') # Set context
ratings_plot=sns.lineplot(data=data_rating_year, x='review_year', y='product_rating', hue='brand', style='brand', markers=True, dashes=False)
ratings_plot.set_title('Average Product Ratings by Year') # Set the title
ratings_plot
I perform sentiment analysis with NLTK VADER on `project_df` for both the `review_title` and `body` columns. I use the `SentimentIntensityAnalyzer()` function in the NLTK VADER module and the `apply()` function to calculate the compound polarity score of each review with the `polarity_scores()` function.
# Create the Sentiment Score columns for review title and body with the NLTK Vader Sentiment Analysis
analyzer = SentimentIntensityAnalyzer()
project_df['Title_Sentiment_Score'] = project_df['review_title'].apply(lambda x: analyzer.polarity_scores(x)['compound'])
project_df['Body_Sentiment_Score'] = project_df['body'].apply(lambda x: analyzer.polarity_scores(x)['compound'])
project_df.head(20)
| | product_id | brand | product_title | url | image | product_rating | reviewUrl | totalReviews | price | name | review_rating | date | verified | review_title | body | review_year | Title_Sentiment_Score | Body_Sentiment_Score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | Marcel Thomas | 1 | 2016-03-05 | True | Stupid phone | DON'T BUY OUT OF SERVICE | 2016 | -0.5267 | 0.0000 |
| 1 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | William B. | 4 | 2006-02-09 | False | Exellent Service | I have been with nextel for nearly a year now ... | 2006 | 0.0000 | 0.8658 |
| 2 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | K. Mcilhargey | 5 | 2006-02-07 | False | I love it | I just got it and have to say its easy to use,... | 2006 | 0.6369 | -0.0516 |
| 3 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | Stephen Cahill | 1 | 2016-12-20 | True | Phones locked | 1 star because the phones locked so I have to ... | 2016 | 0.0000 | -0.1689 |
| 4 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | Mihir | 5 | 2009-12-13 | True | Excellent product | The product has been very good. I had used thi... | 2009 | 0.5719 | 0.8777 |
| 5 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | L. Hughes | 1 | 2005-07-21 | False | WARNING | My problems with nextel did not stop when I ca... | 2005 | -0.3400 | -0.7524 |
| 6 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | 1 Stop 4 Whats HOT | 5 | 2009-06-27 | False | NEXTEL BOOST PHONE | GREAT PRODUCT THAT IS AS GREAT FOR NEXTEL AS I... | 2009 | 0.4019 | 0.9041 |
| 7 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | Thomas | 4 | 2010-09-17 | True | Nice, but | I bought this phone to replace an LG phone tha... | 2010 | 0.2263 | 0.9389 |
| 8 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | Kei, San Jose, CA | 1 | 2017-05-13 | True | It seems it doesn't work with the existing AT&... | I purchased this phone for my AT&T phone repla... | 2017 | 0.0000 | 0.0000 |
| 9 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | Kristy | 1 | 2019-03-13 | True | Supply are needed | The phone did not come with a charger and didn... | 2019 | 0.0000 | 0.0000 |
| 10 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | MARIO GAUTIER | 5 | 2017-05-01 | True | Five Stars | SERVED ME WELL AS A BACK UP PHONE. | 2017 | 0.0000 | 0.2732 |
| 11 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | R-Dash | 3 | 2009-02-10 | True | does the job | I got this phone just as secondary cell phone.... | 2009 | 0.0000 | 0.2382 |
| 12 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | John R. Risden | 4 | 2011-01-19 | True | Awesome with a But!! | Sturdy - clarity is great - easy to use Only p... | 2011 | 0.4826 | 0.4404 |
| 13 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | Amazon Customer | 1 | 2017-02-03 | True | One Star | Phone stoped working | 2017 | 0.0000 | 0.0000 |
| 14 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | New York | 5 | 2013-05-23 | True | Is cheap but ok quality | It does a beautiful job. I have used this item... | 2013 | 0.4215 | 0.8694 |
| 15 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | kalani | 3 | 2014-07-22 | True | Three Stars | way too small | 2014 | 0.0000 | 0.0000 |
| 16 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | Trevor Arms | 1 | 2014-02-11 | True | This phone gave me a concussion. and may come ... | I asked my friend to toss me my phone, but wha... | 2014 | -0.3182 | 0.9858 |
| 17 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | zzzzzzzzzz | 5 | 2008-04-10 | False | Tough phone | We never use cell phones, but thought we neede... | 2008 | -0.1280 | -0.6476 |
| 18 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | Valli LaMar | 5 | 2012-06-21 | False | 4-1/2 years! | I have had this model phone for 4-1/2 years an... | 2012 | 0.0000 | 0.8199 |
| 19 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | S. Chan | 5 | 2007-10-03 | False | simply great! | Easy to use features, slim in size at a very g... | 2007 | 0.6588 | 0.8953 |
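The compound value in these columns is a normalized score between -1 and 1. As a rough sketch of that normalization step only (not the full VADER pipeline, which also accounts for negation, punctuation and capitalization), the summed word valence x is mapped to x / sqrt(x² + α), with α = 15 as the default in the NLTK VADER module. The function name `normalize_valence` and the example valence of 3.2 are assumptions made for illustration.

```python
import math

def normalize_valence(summed_valence, alpha=15):
    """Squash an unbounded summed valence into the range [-1, 1]."""
    score = summed_valence / math.sqrt(summed_valence ** 2 + alpha)
    # Clip to guard against floating-point overshoot
    return max(-1.0, min(1.0, score))

# Assumed summed valence for a short, clearly positive title (illustrative)
compound = normalize_valence(3.2)
print(round(compound, 4))

# A text with no sentiment-bearing words sums to 0 and stays at 0.0
print(normalize_valence(0.0))
```

A summed valence of 0 stays at 0.0, which is consistent with the many short titles in the table that score 0.0000.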
The sentiment scores are stored in `project_df` in the new columns `Title_Sentiment_Score` and `Body_Sentiment_Score`. Next, I create a categorical column `Body_Sentiment_Type`, which classifies each review as Positive, Negative or Neutral based on its `Body_Sentiment_Score`: Positive if the sentiment score is greater than or equal to 0.05, Neutral if it is between -0.05 and 0.05, and Negative if it is lower than or equal to -0.05.
# Create a categorical variable as Positive, Negative or Neutral for reviews
project_df["Body_Sentiment_Type"] = np.where(project_df["Body_Sentiment_Score"] >= 0.05, "Positive", np.where(((project_df["Body_Sentiment_Score"] < 0.05) & (project_df["Body_Sentiment_Score"] > -0.05)), "Neutral", "Negative"))
project_df.head(15)
| | product_id | brand | product_title | url | image | product_rating | reviewUrl | totalReviews | price | name | review_rating | date | verified | review_title | body | review_year | Title_Sentiment_Score | Body_Sentiment_Score | Body_Sentiment_Type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | Marcel Thomas | 1 | 2016-03-05 | True | Stupid phone | DON'T BUY OUT OF SERVICE | 2016 | -0.5267 | 0.0000 | Neutral |
| 1 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | William B. | 4 | 2006-02-09 | False | Exellent Service | I have been with nextel for nearly a year now ... | 2006 | 0.0000 | 0.8658 | Positive |
| 2 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | K. Mcilhargey | 5 | 2006-02-07 | False | I love it | I just got it and have to say its easy to use,... | 2006 | 0.6369 | -0.0516 | Negative |
| 3 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | Stephen Cahill | 1 | 2016-12-20 | True | Phones locked | 1 star because the phones locked so I have to ... | 2016 | 0.0000 | -0.1689 | Negative |
| 4 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | Mihir | 5 | 2009-12-13 | True | Excellent product | The product has been very good. I had used thi... | 2009 | 0.5719 | 0.8777 | Positive |
| 5 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | L. Hughes | 1 | 2005-07-21 | False | WARNING | My problems with nextel did not stop when I ca... | 2005 | -0.3400 | -0.7524 | Negative |
| 6 | B0009N5L7K | Motorola | Motorola I265 phone | https://www.amazon.com/Motorola-i265-I265-phon... | https://m.media-amazon.com/images/I/419WBAVDAR... | 3.0 | https://www.amazon.com/product-reviews/B0009N5L7K | 7 | 49.95 | 1 Stop 4 Whats HOT | 5 | 2009-06-27 | False | NEXTEL BOOST PHONE | GREAT PRODUCT THAT IS AS GREAT FOR NEXTEL AS I... | 2009 | 0.4019 | 0.9041 | Positive |
| 7 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | Thomas | 4 | 2010-09-17 | True | Nice, but | I bought this phone to replace an LG phone tha... | 2010 | 0.2263 | 0.9389 | Positive |
| 8 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | Kei, San Jose, CA | 1 | 2017-05-13 | True | It seems it doesn't work with the existing AT&... | I purchased this phone for my AT&T phone repla... | 2017 | 0.0000 | 0.0000 | Neutral |
| 9 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | Kristy | 1 | 2019-03-13 | True | Supply are needed | The phone did not come with a charger and didn... | 2019 | 0.0000 | 0.0000 | Neutral |
| 10 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | MARIO GAUTIER | 5 | 2017-05-01 | True | Five Stars | SERVED ME WELL AS A BACK UP PHONE. | 2017 | 0.0000 | 0.2732 | Positive |
| 11 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | R-Dash | 3 | 2009-02-10 | True | does the job | I got this phone just as secondary cell phone.... | 2009 | 0.0000 | 0.2382 | Positive |
| 12 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | John R. Risden | 4 | 2011-01-19 | True | Awesome with a But!! | Sturdy - clarity is great - easy to use Only p... | 2011 | 0.4826 | 0.4404 | Positive |
| 13 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | Amazon Customer | 1 | 2017-02-03 | True | One Star | Phone stoped working | 2017 | 0.0000 | 0.0000 | Neutral |
| 14 | B000SKTZ0S | Motorola | MOTOROLA C168i AT&T CINGULAR PREPAID GOPHONE C... | https://www.amazon.com/MOTOROLA-C168i-CINGULAR... | https://m.media-amazon.com/images/I/71b+q3ydkI... | 2.7 | https://www.amazon.com/product-reviews/B000SKTZ0S | 22 | 99.99 | New York | 5 | 2013-05-23 | True | Is cheap but ok quality | It does a beautiful job. I have used this item... | 2013 | 0.4215 | 0.8694 | Positive |
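The threshold rule behind `Body_Sentiment_Type` can also be sketched as a plain function, which makes the boundary cases easier to check than the nested `np.where` call. The name `classify_sentiment` is hypothetical; the thresholds are the ones stated above.

```python
def classify_sentiment(compound_score):
    """Map a compound score to Positive, Neutral or Negative."""
    if compound_score >= 0.05:
        return "Positive"
    if compound_score <= -0.05:
        return "Negative"
    return "Neutral"

# Scores taken from the first rows of the table above
for score in (0.0000, 0.8658, -0.0516, -0.1689):
    print(score, classify_sentiment(score))
```

With pandas, the same function could be applied as `project_df['Body_Sentiment_Score'].apply(classify_sentiment)`, which should produce the same labels as the `np.where` expression.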
I plot the number of reviews of each sentiment type for every brand using the `sns.countplot()` function.
# Plot the Sentiment Types for each brand
plt.figure(figsize = (25,10))
sentiment_plot = sns.countplot(data = project_df, x = "brand", hue = "Body_Sentiment_Type")
plt.title("Sentiment Type for each Brand")
for p in sentiment_plot.patches:
    plt.annotate(f'{p.get_height()}', xy=(p.get_x() + p.get_width() / 2, p.get_height()), ha='center', va='bottom')
The review counts for Samsung are very high compared to any other brand. This may be because Samsung phones offer better features in a good price range than the other cellphone brands.
Next, I use the `imageio` library to read an image mask for the word clouds.
# Read the image
mask_image = imageio.imread('mask_oval.jpeg')
# Create a new data frame with only the reviews and the sentiment type for body
text_df = project_df[['body','Body_Sentiment_Type']].copy()
# Convert positive and negative reviews into long strings that can be analyzed
positive_df = text_df[text_df['Body_Sentiment_Type'] == 'Positive']
positive_text_list = positive_df['body'].values.tolist()
# Join with a space so the last word of one review does not merge with the first word of the next
positive_text = ' '.join(positive_text_list)
negative_df = text_df[text_df['Body_Sentiment_Type'] == 'Negative']
negative_text_list = negative_df['body'].values.tolist()
negative_text = ' '.join(negative_text_list)
# Plot the positive review wordcloud
positive_wordcloud = WordCloud(colormap='prism', mask=mask_image, background_color='white', max_font_size=300, stopwords=stopwords)
positive_wordcloud = positive_wordcloud.generate(positive_text)
plt.figure(figsize=(20,10))
plt.title('Positive Review Word Cloud', fontsize = 32, fontweight = 'bold')
plt.grid(False)
plt.imshow(positive_wordcloud)
The most prominent words in the positive word cloud are phone, screen, battery, work, great, camera, app, etc. Thus, I can infer that the customers who wrote positive reviews really like their phones and describe features such as the battery, camera and screen as great.
# Plot the negative review wordcloud
negative_wordcloud = WordCloud(colormap='prism', mask=mask_image, background_color='white', max_font_size = 300, stopwords = stopwords)
negative_wordcloud = negative_wordcloud.generate(negative_text)
plt.figure(figsize=(20,10))
plt.title('Negative Review Word Cloud', fontsize = 32, fontweight = 'bold')
plt.grid(False)
plt.imshow(negative_wordcloud)
The most prominent words in the negative word cloud are phone, screen, battery, time, work, charge, etc. Thus, I can infer that the customers who wrote negative reviews have issues with the phones that they purchased. Verizon and Samsung are also prominent in the word cloud, so it may be that customers who purchased phones with Verizon as the carrier ran into problems, and that Verizon services don't work as well with Samsung phones as with other brands.

Next, I analyze the word Battery for four brands, Samsung, Motorola, Apple and Xiaomi, because the word Battery was very prominent in both the positive and negative word clouds above, and I can determine the significance of that word for the reviewers.
# Create a new dataframe with only brand, product title, body and the Sentiment Type
phone_df = project_df[['brand','product_title','body','Body_Sentiment_Type']]
# Convert the positive and negative review text of Apple, Samsung, Motorola, and Xiaomi into separate lists
apple_positive_list = phone_df[(phone_df['Body_Sentiment_Type']=='Positive') & (phone_df['brand']=='Apple')]['body'].values.tolist()
apple_negative_list = phone_df[(phone_df['Body_Sentiment_Type']=='Negative') & (phone_df['brand']=='Apple')]['body'].values.tolist()
samsung_positive_list = phone_df[(phone_df['Body_Sentiment_Type']=='Positive') & (phone_df['brand']=='Samsung')]['body'].values.tolist()
samsung_negative_list = phone_df[(phone_df['Body_Sentiment_Type']=='Negative') & (phone_df['brand']=='Samsung')]['body'].values.tolist()
motorola_positive_list = phone_df[(phone_df['Body_Sentiment_Type']=='Positive') & (phone_df['brand']=='Motorola')]['body'].values.tolist()
motorola_negative_list = phone_df[(phone_df['Body_Sentiment_Type']=='Negative') & (phone_df['brand']=='Motorola')]['body'].values.tolist()
xiaomi_positive_list = phone_df[(phone_df['Body_Sentiment_Type']=='Positive') & (phone_df['brand']=='Xiaomi')]['body'].values.tolist()
xiaomi_negative_list = phone_df[(phone_df['Body_Sentiment_Type']=='Negative') & (phone_df['brand']=='Xiaomi')]['body'].values.tolist()
I calculate the scores using `CountVectorizer()` and `TfidfTransformer()` from the sklearn module.
# Store the CountVectorizer and the TfidfTransformer in variables
cv = CountVectorizer()
tfidf_transformer = TfidfTransformer()
# Create a list of positive and negative reviews for each of the four brands
positive_data_list = [apple_positive_list, samsung_positive_list, motorola_positive_list, xiaomi_positive_list]
negative_data_list = [apple_negative_list, samsung_negative_list, motorola_negative_list, xiaomi_negative_list]
# Define a list of the 4 brands
brand_list = ['Apple', 'Samsung', 'Motorola', 'Xiaomi']
positive_score = []
negative_score = []
# Calculate the TF-IDF score of the word 'battery' for the positive reviews of each brand
for phone_pos in positive_data_list:
    # Convert the list of positive review text into a term-frequency matrix
    phone_data_pos = cv.fit_transform(phone_pos)
    # Convert the term-frequency matrix into a TF-IDF matrix
    tfidf_matrix_pos = tfidf_transformer.fit_transform(phone_data_pos)
    # Map each word to its weight; note that idf_ holds the inverse document frequency (IDF) of each word
    # get_feature_names() was removed in scikit-learn 1.2; get_feature_names_out() replaces it
    word2tfidf = dict(zip(cv.get_feature_names_out(), tfidf_transformer.idf_))
    # CountVectorizer lowercases text by default, so only 'battery' appears in the vocabulary
    for word, score in word2tfidf.items():
        if word in ('battery','Battery'):
            positive_score.append(score)
# Similarly, calculate the TF-IDF score of the word 'battery' for the negative reviews of each brand
for phone_neg in negative_data_list:
    phone_data_neg = cv.fit_transform(phone_neg)
    tfidf_matrix_neg = tfidf_transformer.fit_transform(phone_data_neg)
    # get_feature_names() was removed in scikit-learn 1.2; get_feature_names_out() replaces it
    word2tfidf = dict(zip(cv.get_feature_names_out(), tfidf_transformer.idf_))
    for word, score in word2tfidf.items():
        if word in ('battery','Battery'):
            negative_score.append(score)
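The loops above read `tfidf_transformer.idf_`, which in scikit-learn holds the inverse document frequency (IDF) weight of each vocabulary word rather than a per-review TF-IDF value. As a sketch, assuming scikit-learn's default `smooth_idf=True` setting, that weight is ln((1 + n) / (1 + df)) + 1, where n is the number of reviews and df is the number of reviews containing the word. The helper `smoothed_idf`, the whitespace tokenization and the toy reviews below are simplifications made for illustration.

```python
import math

def smoothed_idf(documents, term):
    """Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1."""
    n_docs = len(documents)
    # Document frequency: how many documents contain the term at least once
    doc_freq = sum(1 for doc in documents if term in doc.lower().split())
    return math.log((1 + n_docs) / (1 + doc_freq)) + 1

# Made-up reviews for illustration only
toy_reviews = [
    "battery lasts all day",
    "great screen and camera",
    "the battery drains fast",
    "works as expected",
]

battery_idf = smoothed_idf(toy_reviews, "battery")
camera_idf = smoothed_idf(toy_reviews, "camera")
print(f"battery: {battery_idf:.3f}, camera: {camera_idf:.3f}")
```

Under this formula a lower value means the word appears in a larger share of the reviews, so a lower score for battery indicates that more of that brand's reviews mention it.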
# Make a plot of the TF-IDF value of the word 'battery' in both positive and negative reviews for top 4 brands
brand_name = ['Apple', 'Samsung', 'Motorola', 'Xiaomi']
width = 0.25
x = list(range(len(positive_score)))
plt.figure(figsize=(12,8))
# Create the bars for the TF-IDF scores of positive reviews
positive_plot = plt.bar(x, positive_score, tick_label = brand_name, label = 'Positive', width = width)
for i in range(len(x)):
    x[i] += width
# Create the bars for the TF-IDF scores of negative reviews
negative_plot = plt.bar(x, negative_score, tick_label = brand_name, label = 'Negative', width = width)
plt.xticks(size = 20)
plt.yticks(size = 20)
plt.xlabel('Phone Brands', fontsize = 24)
plt.ylabel('TF-IDF Scores', fontsize = 24)
plt.title('TF-IDF Scores for Positive and Negative Reviews on Battery for 4 Brands', fontsize = 24, fontweight = 'bold')
# Annotate the value for each bar
for p in positive_plot.patches:
    plt.annotate(f'{p.get_height():.2f}', xy=(p.get_x() + p.get_width() / 2, p.get_height()), ha='center', va='bottom', fontsize = 15)
for p in negative_plot.patches:
    plt.annotate(f'{p.get_height():.2f}', xy=(p.get_x() + p.get_width() / 2, p.get_height()), ha='center', va='bottom', fontsize = 15)
plt.legend(fontsize = 15)
plt.show()